Validating and Refining Clusters via Visual Rendering

Authors

  • Keke Chen
  • Ling Liu
Abstract

Clustering is an important technique for understanding and analyzing large multi-dimensional datasets in many scientific applications. Most clustering research to date has focused on developing automatic clustering algorithms or cluster validation methods. The automatic algorithms are known to work well on clusters of regular shapes, e.g. compact spherical shapes, but may incur higher error rates on arbitrarily shaped clusters. Although some efforts have been devoted to the problem of skewed datasets, the handling of clusters with irregular shapes is still in its infancy, especially with respect to the dimensionality of the datasets and the precision of the clustering results considered. Not surprisingly, statistical indices are also ineffective in validating clusters of irregular shapes. In this paper, we address the problem of cluster rendering of skewed datasets by introducing a series of visual rendering techniques and a visual framework (VISTA). A main idea of the VISTA approach is to capitalize on the power of visualization and interactive feedback to encourage domain experts to participate in the clustering revision and validation process. The VISTA system has two unique features. First, it implements a linear and reliable mapping model to visualize k-dimensional datasets in a 2D star-coordinate space. Second, it provides a rich set of user-friendly yet effective interactive rendering operations, allowing users to validate and interactively refine the cluster structure based on their visual experience as well as their domain knowledge.
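
The abstract does not give the mapping formula, so the following is only a minimal sketch of a generic star-coordinate projection, not necessarily the model VISTA implements. It assumes k axes spread evenly around the origin, min-max normalization per dimension, and one adjustable weight per axis; the names `star_coordinates` and `alphas` are hypothetical and chosen for illustration.

```python
import math


def star_coordinates(points, alphas=None):
    """Project k-dimensional points to 2D with a generic star-coordinate map.

    Assumptions (not taken from the paper): axes are evenly spaced in angle,
    each dimension is min-max normalized to [0, 1], and `alphas` holds one
    user-adjustable weight per axis (default 1.0 for every axis).
    """
    k = len(points[0])
    if alphas is None:
        alphas = [1.0] * k

    # Per-dimension range, used for min-max normalization.
    mins = [min(p[i] for p in points) for i in range(k)]
    maxs = [max(p[i] for p in points) for i in range(k)]

    # Axis i points in direction 2*pi*i/k.
    angles = [2.0 * math.pi * i / k for i in range(k)]

    projected = []
    for p in points:
        x = 0.0
        y = 0.0
        for i in range(k):
            span = maxs[i] - mins[i]
            # A constant dimension carries no information; map it to 0.
            v = (p[i] - mins[i]) / span if span > 0 else 0.0
            x += alphas[i] * v * math.cos(angles[i])
            y += alphas[i] * v * math.sin(angles[i])
        # Dividing by k keeps the image bounded; the map stays linear
        # in the normalized coordinates.
        projected.append((x / k, y / k))
    return projected


if __name__ == "__main__":
    data = [[0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
    for point_2d in star_coordinates(data):
        print(point_2d)
```

Because the projection is a weighted sum of the normalized coordinates, changing one entry of `alphas` moves every point along a single axis direction, which is the kind of interactive adjustment a user could make while looking for cluster boundaries.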


Similar resources

VISTA: validating and refining clusters via visualization

Clustering is an important technique for understanding large multi-dimensional datasets. Most clustering research to date has focused on developing automatic clustering algorithms and cluster validation methods. The automatic algorithms are known to work well on clusters of regular shapes, e.g. compact spherical shapes, but may incur higher error rates when dealing with ...

VISTA: Validating and Refining Clusters via Visualization (final version)

Clustering is an important technique for understanding large multi-dimensional datasets. Most clustering research to date has focused on developing automatic clustering algorithms and cluster validation methods. The automatic algorithms are known to work well on clusters of regular shapes, e.g. compact spherical shapes, but may incur higher error rates when dealing with ...

Optimizing star-coordinate visualization models for effective interactive cluster exploration on big data

Interactive visual cluster analysis is the most intuitive way for finding clustering patterns, validating algorithmic clustering results, understanding data clusters with domain knowledge, and refining cluster definitions. The most challenging step is visualizing multidimensional data and allowing a user to interactively explore the data to identify clustering structures. In this paper, we syst...

Refining membership degrees obtained from fuzzy C-means by re-fuzzification

Fuzzy C-means (FCM) is the most well-known and widely used fuzzy clustering algorithm. However, one weakness of FCM is that it assigns membership degrees to data based only on the distance to the cluster centers. As a result, the membership degrees are determined without considering the shape and density of the clusters. In this paper, we propose an algorithm which takes th...

Perceptually Driven Point Sample Rendering

Motivation: the visible difference between renderings from refined geometry and coarser geometry is not constant over the image plane (see figure below); the computational effort for refining should be aimed at the features that cause the highest difference. Problem: evaluating visible differences is too expensive to be done in real time. Our strategy: perceptual analysis is done as a prepr...

Publication date: 2003